Non-Parametric Kernel Learning with robust pairwise constraints
Authors
Changyou Chen (Research School of Information Sciences and Engineering, The Australian National University, Canberra, Australia; this work was done while the author was at Fudan University, Shanghai, China)
Junping Zhang (Shanghai Key Laboratory of Intelligent Information Processing and School of Computer Science, Fudan University, Shanghai, China)
Xuefang He (School of Software and Information Engineering, Beihai College of Beihang University)
Zhi-Hua Zhou (National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China)
Abstract
Existing kernel-learning-based semi-supervised clustering algorithms generally have difficulty scaling to large datasets and to large numbers of pairwise constraints. In this paper, we propose a new Non-Parametric Kernel Learning (NPKL) framework to deal with these problems. We generalize the graph embedding framework to kernel learning by reformulating it as a semidefinite programming (SDP) problem, using Laplacian regularization to smooth the functional Hilbert space while avoiding over-smoothing. We propose two algorithms to solve this problem. The first is a straightforward algorithm that uses SDP to solve the original kernel learning problem, denoted TRAnsductive Graph Embedding Kernel learning (TRAGEK). The second relaxes the SDP problem and solves it with a constrained gradient descent algorithm. To accelerate learning, we further divide the data into groups and use the sub-kernels of these groups to approximate the whole kernel matrix. This algorithm is denoted Efficient Non-PArametric Kernel Learning (ENPAKL).
The advantages of the proposed NPKL framework are: 1) supervised information in the form of pairwise constraints can be easily incorporated; 2) it is robust to the number of pairwise constraints, i.e., the number of constraints does not greatly affect the running time; 3) ENPAKL is efficient to some extent compared with some related kernel learning algorithms, since it is based on constrained gradient descent. Clustering experiments based on the learned kernels show that the proposed framework scales well with the size of the datasets and the number of pairwise constraints. Further experiments on image segmentation indicate the potential advantages of the proposed algorithms over the traditional k-means and N-cut clustering algorithms in terms of segmentation accuracy.
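The abstract does not state the objective function, so the following is only a minimal sketch of the general idea it describes: learning a positive semidefinite kernel matrix under Laplacian regularization and pairwise constraints with a projected (constrained) gradient descent. It is not TRAGEK or ENPAKL. The objective tr(KL) + C * sum of squared constraint violations, the targets (1 for must-link pairs and the diagonal, 0 for cannot-link pairs), the Gaussian affinity graph, and the names graph_laplacian, project_psd and learn_kernel are all assumptions made for illustration.

```python
import numpy as np


def graph_laplacian(X, sigma=1.0):
    """Normalized Laplacian of a dense Gaussian-affinity graph (assumed graph choice)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]


def project_psd(K):
    """Project a matrix onto the PSD cone by clipping negative eigenvalues."""
    K = (K + K.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(K)
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T


def learn_kernel(L, must_link, cannot_link, C=10.0, step=0.01, iters=500):
    """Projected gradient descent on the assumed objective
    tr(K L) + C * sum_{(i,j) constrained} (K_ij - T_ij)^2, subject to K being PSD."""
    n = L.shape[0]
    target = np.eye(n)  # assumed target: unit diagonal
    mask = np.eye(n)    # 1 where an entry of K is constrained
    for i, j in must_link:
        target[i, j] = target[j, i] = 1.0
        mask[i, j] = mask[j, i] = 1.0
    for i, j in cannot_link:
        target[i, j] = target[j, i] = 0.0
        mask[i, j] = mask[j, i] = 1.0
    K = np.eye(n)
    for _ in range(iters):
        grad = L + 2.0 * C * mask * (K - target)  # gradient of the assumed objective
        K = project_psd(K - step * grad)          # gradient step, then PSD projection
    return K


# Toy usage: two well-separated groups with a few pairwise constraints.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (5, 2)), rng.normal(5.0, 0.3, (5, 2))])
K = learn_kernel(graph_laplacian(X), must_link=[(0, 1), (5, 6)], cannot_link=[(0, 5), (1, 6)])
print(np.round(K[:2, [0, 1, 5, 6]], 2))
```

The eigenvalue-clipping projection costs O(n^3) per iteration, which is one plausible reason a method of this kind would approximate the full kernel matrix with sub-kernels over groups of data, as the abstract describes for ENPAKL.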
Similar resources
A Family of Simple Non-Parametric Kernel Learning Algorithms
Previous studies of Non-Parametric Kernel Learning (NPKL) usually formulate the learning task as a Semi-Definite Programming (SDP) problem that is often solved by general-purpose SDP solvers. However, for N data examples, the time complexity of NPKL using a standard interior-point SDP solver could be as high as O(N^6.5), which prevents NPKL methods from being applied to real applications, even for ...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Pairwise Exemplar Clustering
Exemplar-based clustering methods have been extensively shown to be effective in many clustering problems. They adaptively determine the number of clusters and hold the appealing advantage of not requiring the estimation of latent parameters, which is otherwise difficult in the case of complicated parametric models and high-dimensional data. However, modeling arbitrary underlying distribut...
Breast cancer diagnosis using non-parametric probability density estimation based on kernel methods
Introduction: Breast cancer is the most common cancer in women. An accurate and reliable system for the early diagnosis of benign or malignant tumors seems necessary. Using the results of FNA together with data mining and machine learning techniques, new methods can be designed for the early diagnosis of breast cancer that are able to detect it with high accuracy. Materials and Methods: In this study,...
Semi-supervised clustering with metric learning: An adaptive kernel method
Most existing representative works in semi-supervised clustering do not sufficiently solve the violation problem of pairwise constraints. On the other hand, traditional kernel methods for semi-supervised clustering not only face the problem of manually tuning the kernel parameters due to the fact that no sufficient supervision is provided, but also lack a measure that achieves better effectiven...
Journal title:
- Int. J. Machine Learning & Cybernetics
Volume 3, Issue -
Pages -
Publication date 2012